On principal components regression, random projections, and column subsampling
Authors
Abstract
Similar resources
Nonparametric Principal Components Regression
In ordinary least squares regression, dimensionality is a sensitive issue. As the number of independent variables approaches the sample size, the least squares algorithm can easily fail: estimates are not unique or are very unstable (Draper and Smith, 1981). Several problems are usually encountered in modeling high-dimensional data, including the difficulty of visualizing the data, s...
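The standard remedy the title refers to, principal components regression, can be sketched as follows: project the centered design matrix onto its top-k principal components and run OLS in that low-dimensional space. This is a minimal NumPy illustration with toy dimensions of my choosing, not code from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ill-conditioned setting: p close to n, where plain OLS is unstable.
n, p, k = 50, 40, 5
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:k] = 1.0
y = X @ beta + 0.1 * rng.normal(size=n)

# Principal components regression: center X, project onto the top-k
# right singular vectors, and solve least squares in k dimensions.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:k].T                       # scores on the first k components
gamma, *_ = np.linalg.lstsq(Z, y - y.mean(), rcond=None)
beta_pcr = Vt[:k].T @ gamma             # coefficients in the original basis

print(beta_pcr.shape)  # (40,)
```

Because the regression is done in a k-dimensional space, the fit is well-posed even when p is close to n.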
Kernel Principal Components Are Maximum Entropy Projections
Principal Component Analysis (PCA) is a well-known statistical tool. Kernel PCA is a nonlinear extension of PCA based on the kernel paradigm. In this paper we characterize the projections found by Kernel PCA from an information-theoretic perspective. We prove that Kernel PCA provides optimal entropy projections in the input space when the Gaussian kernel is used for the mapping and a sample...
Linear regression with random projections
We investigate a method for regression that makes use of a randomly generated subspace GP ⊂ F (of finite dimension P) of a given large (possibly infinite-dimensional) function space F, for example L2([0,1]^d; R). GP is defined as the span of P random features that are linear combinations of basis functions of F weighted by i.i.d. random Gaussian coefficients. We show practical motivation fo...
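The random-features construction described above can be sketched in a few lines: fix a deterministic basis, form P features as Gaussian-weighted combinations of the basis functions, and solve least squares in the resulting P-dimensional random subspace. The sine basis, the sizes, and the target function below are toy choices for illustration, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(1)

# A simple deterministic basis on [0, 1]; stands in for a basis of F.
def basis(x, m=200):
    j = np.arange(1, m + 1)
    return np.sin(np.pi * np.outer(x, j))   # shape (n, m)

n, m, P = 100, 200, 15
x = rng.uniform(size=n)
y = np.sin(2 * np.pi * x) + 0.05 * rng.normal(size=n)

# P random features: i.i.d. Gaussian combinations of the m basis functions.
A = rng.normal(size=(m, P)) / np.sqrt(m)
Phi = basis(x, m) @ A                        # features evaluated at the data
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # least squares in the random subspace

print(w.shape)  # (15,)
```

The fitted function is then x ↦ basis(x) @ A @ w, a member of the random P-dimensional subspace GP.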
Random projections for Bayesian regression
This article deals with random projections applied as a data reduction technique for Bayesian regression analysis. We show sufficient conditions under which the entire d-dimensional distribution is approximately preserved under random projections by reducing the number of data points from n to k ∈ O(poly(d/ε)) in the case n ≫ d. Under mild assumptions, we prove that evaluating a Gaussian likeliho...
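The data-reduction idea, compressing n observations down to k sketched rows before fitting, can be illustrated with a plain Gaussian sketch applied to least squares (a simplified frequentist stand-in for the Bayesian analysis in the paper; sizes are arbitrary toy values):

```python
import numpy as np

rng = np.random.default_rng(2)

# n observations of a d-dimensional regression problem, with n >> d.
n, d, k = 10000, 5, 200
X = rng.normal(size=(n, d))
y = X @ np.ones(d) + 0.1 * rng.normal(size=n)

# Gaussian sketch: reduce the n data points to k sketched rows,
# then solve the much smaller k x d least-squares problem.
S = rng.normal(size=(k, n)) / np.sqrt(k)
beta_sketch, *_ = np.linalg.lstsq(S @ X, S @ y, rcond=None)
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.linalg.norm(beta_sketch - beta_full))
```

With k polynomial in d (and independent of n), the sketched solution stays close to the full-data solution, which is the sense in which the reduced data set approximately preserves the inference.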
Sparse principal component analysis via random projections
We introduce a new method for sparse principal component analysis, based on the aggregation of eigenvector information from carefully selected random projections of the sample covariance matrix. Unlike most alternative approaches, our algorithm is non-iterative, and so is not vulnerable to a bad choice of initialisation. Our theory gives a detailed account of the statistical and computational trade-of...
Journal
Journal title: Electronic Journal of Statistics
Year: 2018
ISSN: 1935-7524
DOI: 10.1214/18-ejs1486